Variable Selection for Doubly Robust Causal Inference
Confounding control is crucial yet challenging for causal inference based
on observational studies. Under the typical unconfoundedness assumption,
augmented inverse probability weighting (AIPW) has been popular for estimating
the average causal effect (ACE) due to its double robustness, in the sense that
it remains consistent if either the propensity score model or the outcome mean
model is correctly specified. To ensure the key assumption holds, effort is
often made to collect a sufficiently rich set of pretreatment variables, rendering
variable selection imperative. It is well known that variable selection for the
propensity score targeted at accurate prediction may produce a highly variable
ACE estimator by including instrumental variables. Thus, many recent works
recommend selecting all outcome predictors for both confounding control and
efficient estimation. This article shows that the AIPW estimator with variable
selection targeted for efficient estimation may lose the desirable double
robustness property. Instead, we propose including in the propensity score model
any covariate that is a predictor of the treatment, the outcome, or both, which
preserves the double robustness of the AIPW estimator. Using this principle, we
propose a two-stage procedure: penalized variable selection followed by AIPW
estimation. We show that the proposed procedure retains the desirable double
robustness property. We evaluate the finite-sample performance of the AIPW
estimator under various variable selection criteria through simulation and an
application.
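For reference, the estimator discussed above has a standard textbook form; the notation here ($\hat e$ for the fitted propensity score, $\hat\mu_a$ for the fitted outcome means) is the generic one and may differ from the article's:

\[
\hat\tau_{\mathrm{AIPW}} \;=\; \frac{1}{n}\sum_{i=1}^{n}\left[\hat\mu_1(X_i) + \frac{A_i\,\{Y_i-\hat\mu_1(X_i)\}}{\hat e(X_i)}\right] \;-\; \frac{1}{n}\sum_{i=1}^{n}\left[\hat\mu_0(X_i) + \frac{(1-A_i)\,\{Y_i-\hat\mu_0(X_i)\}}{1-\hat e(X_i)}\right],
\]

which is consistent for the ACE if either $\hat\mu_a(\cdot)$ or $\hat e(\cdot)$ is correctly specified; this is the double robustness property the abstract refers to.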
Domain-independent Punctuation and Segmentation Insertion
Punctuation and segmentation are crucial in spoken language translation, as they have a strong impact on translation performance. However, the impact of rare or unknown words on the performance of punctuation and segmentation insertion has not been thoroughly studied. In this work, we simulate various degrees of domain match in the test scenario and investigate their impact on the punctuation insertion task. We explore three schemes for generalizing rare words using part-of-speech (POS) tokens. Experiments show that generalizing rare and unknown words greatly improves punctuation insertion performance, yielding up to 8.8 points of improvement in F-score in the out-of-domain test scenario. We show that this improvement in punctuation quality has a positive impact on downstream machine translation (MT) performance, improving it by 2 BLEU points.
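The core idea of replacing rare or unknown words with POS tokens can be sketched as follows. The frequency threshold and the toy tagger input are illustrative assumptions, not the paper's actual configuration:

```python
# Sketch: generalizing rare words with POS tags before punctuation insertion.
# The min_count threshold and example data are illustrative assumptions.
from collections import Counter

def build_vocab(corpus_tokens, min_count=5):
    """Keep only words appearing at least min_count times in training data."""
    counts = Counter(corpus_tokens)
    return {w for w, c in counts.items() if c >= min_count}

def generalize_rare(tokens, pos_tags, vocab):
    """Replace out-of-vocabulary tokens with their POS tag."""
    return [w if w in vocab else pos for w, pos in zip(tokens, pos_tags)]

corpus = ["the", "cat", "sat", "the", "cat", "ran", "the", "dog"]
vocab = build_vocab(corpus, min_count=2)  # {'the', 'cat'}

sent = ["the", "platypus", "ran"]
tags = ["DT", "NN", "VBD"]
print(generalize_rare(sent, tags, vocab))  # ['the', 'NN', 'VBD']
```

An out-of-domain sentence is thus mapped onto a mixture of frequent words and POS tokens, so the punctuation model sees familiar symbols even for unseen vocabulary.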
On optimum sensing time over fading channels for Cognitive Radio system
Cognitive Radio (CR) is widely expected to be the next Big Bang in wireless communications. In a CR network, the secondary users are allowed to utilize the frequency bands of primary users when these bands are not currently being used. For this, the secondary user should be able to detect the presence of the primary user. Therefore, spectrum sensing is of significant importance in CR networks.
In this thesis, we consider the antenna selection problem over fading channels to optimize the trade-off between the probability of detection and the power efficiency of CR systems. We mathematically formulate a target function consisting of detection probability and power efficiency, and use the energy detection sensing scheme to prove that the formulated problem has a single optimal sensing time that yields the highest target function value.
Two modelling techniques are used to model the Rayleigh fading channels: one without correlations and one with correlations in the temporal and frequency domains. For each model, we consider two scenarios for the average SNRs of the channels. In the first scenario, the channels have clearly distinct average SNRs; in the second, the channels have similar average SNRs. The antenna selection criterion is based on the received signal strength; each simulation is compared with the worst-case simulation, in which the antennas are selected randomly.
Numerical results show that the proposed antenna selection criterion enhances the detection probability and shortens the optimal sensing time. The target function achieves a higher value while maintaining a 0.9 detection probability compared with the worst-case simulation. The optimal sensing time also varies with other parameters, such as the weighting factor of the target function.
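The sensing-time trade-off described above can be sketched numerically. The specific target function f(t) = w*Pd(t) + (1-w)*(1 - t/T), the Liang-style energy-detection approximation for Pd, and all parameter values below are illustrative assumptions; the thesis' exact formulation may differ:

```python
# Sketch: one interior optimal sensing time for an energy-detection
# target function. All parameters are illustrative assumptions.
from statistics import NormalDist

ND = NormalDist()
Q = lambda x: 1.0 - ND.cdf(x)          # Gaussian tail probability
Qinv = lambda p: ND.inv_cdf(1.0 - p)   # inverse of Q

def detection_prob(t, snr, fs=1e6, pfa=0.1):
    """Approximate Pd of energy detection after sensing for t seconds."""
    n = t * fs  # number of samples collected
    return Q((Qinv(pfa) - (n ** 0.5) * snr) / ((2 * snr + 1) ** 0.5))

def target(t, snr, w=0.8, frame=0.1):
    """Weighted combination of detection probability and power efficiency."""
    return w * detection_prob(t, snr) + (1 - w) * (1 - t / frame)

# Scan candidate sensing times in (0, frame) and pick the maximizer.
times = [i * 1e-4 for i in range(1, 1000)]
best = max(times, key=lambda t: target(t, snr=0.01))
```

Longer sensing raises Pd but wastes transmission time and power, so the weighted objective peaks at an interior sensing time, mirroring the thesis' claim of a single optimum.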
Eccentric Exercise in Treatment of Patellar Tendinopathy in High Level Basketball Players. A Randomized Clinical Trial.
Chronic patellar tendinopathy is a common pathology in the sporting population. To date, there is no agreed-upon protocol as the treatment of choice. Eccentric exercises have been used with satisfactory outcomes (3). The purpose of this trial was to compare the effects of two eccentric exercise protocols.
Graph Meets LLM: A Novel Approach to Collaborative Filtering for Robust Conversational Understanding
Conversational AI systems such as Alexa need to understand defective queries
to ensure robust conversational understanding and reduce user friction. These
defective queries often arise from user ambiguities, mistakes, or errors in
automatic speech recognition (ASR) and natural language understanding (NLU).
Personalized query rewriting is an approach that focuses on reducing defects
in queries by taking into account the user's individual behavior and
preferences. It typically relies on an index of past successful user
interactions with the conversational AI. However, unseen interactions within
the user's history present additional challenges for personalized query
rewriting. This paper presents our "Collaborative Query Rewriting" approach,
which specifically addresses the task of rewriting new user interactions that
have not been previously observed in the user's history. This approach builds a
"User Feedback Interaction Graph" (FIG) of historical user-entity interactions
and leverages multi-hop graph traversal to enrich each user's index to cover
future unseen defective queries. The enriched user index is called a
Collaborative User Index and contains hundreds of additional entries. To
counteract the precision degradation caused by the enlarged index, we add
transformer layers to the L1 retrieval model and incorporate graph-based and
guardrail features into the L2 ranking model.
Since the user index can be pre-computed, we further investigate the
utilization of a Large Language Model (LLM) to enhance the FIG for user-entity
link prediction in the Video/Music domains. Specifically, this paper
investigates the Dolly-V2 7B model. We found that the user index augmented by
the fine-tuned Dolly-V2 generation significantly enhanced the coverage of
future unseen user interactions, thereby boosting QR performance on unseen
queries compared with the graph-traversal-only approach.
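The multi-hop enrichment step can be sketched as a two-hop traversal over a bipartite user-entity graph. The graph layout, hop count, and toy interaction data are illustrative assumptions, not the paper's production setup:

```python
# Sketch: enriching a user's index via two-hop traversal of a
# user-entity interaction graph (FIG). Data and structure are illustrative.
from collections import defaultdict

def build_fig(interactions):
    """interactions: (user, entity) pairs -> bipartite adjacency maps."""
    user_to_ent, ent_to_user = defaultdict(set), defaultdict(set)
    for u, e in interactions:
        user_to_ent[u].add(e)
        ent_to_user[e].add(u)
    return user_to_ent, ent_to_user

def collaborative_index(user, user_to_ent, ent_to_user):
    """Two hops: user -> shared entities -> similar users -> their entities."""
    own = set(user_to_ent[user])
    enriched = set(own)
    for ent in own:
        for neighbor in ent_to_user[ent]:
            if neighbor != user:
                enriched |= user_to_ent[neighbor]
    return enriched

pairs = [("u1", "song_a"), ("u1", "song_b"),
         ("u2", "song_b"), ("u2", "song_c")]
u2e, e2u = build_fig(pairs)
print(sorted(collaborative_index("u1", u2e, e2u)))
# u1's enriched index now covers song_c, reached via the shared song_b
```

Entities reached through users with overlapping histories become rewrite candidates for queries the user has never issued before, which is why the enriched index can cover future unseen defects.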
X-Eval: Generalizable Multi-aspect Text Evaluation via Augmented Instruction Tuning with Auxiliary Evaluation Aspects
Natural Language Generation (NLG) typically involves evaluating the generated
text in various aspects (e.g., consistency and naturalness) to obtain a
comprehensive assessment. However, multi-aspect evaluation remains challenging
as it may require the evaluator to generalize to any given evaluation aspect
even if that aspect is absent during training. In this paper, we introduce X-Eval, a
two-stage instruction tuning framework to evaluate the text in both seen and
unseen aspects customized by end users. X-Eval consists of two learning stages:
the vanilla instruction tuning stage that improves the model's ability to
follow evaluation instructions, and an enhanced instruction tuning stage that
exploits the connections between fine-grained evaluation aspects to better
assess text quality. To support the training of X-Eval, we collect
AspectInstruct, the first instruction tuning dataset tailored for multi-aspect
NLG evaluation spanning 27 diverse evaluation aspects with 65 tasks. To enhance
task diversity, we devise an augmentation strategy that converts human rating
annotations into diverse forms of NLG evaluation tasks, including scoring,
comparison, ranking, and Boolean question answering. Extensive experiments
across three essential categories of NLG tasks (dialogue generation,
summarization, and data-to-text), coupled with 21 aspects in meta-evaluation,
demonstrate that X-Eval enables even a lightweight language model to achieve a
correlation with human judgments comparable to, if not higher than, that of
state-of-the-art NLG evaluators such as GPT-4.
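The augmentation strategy of deriving several task forms from one set of human ratings can be sketched as below. The record layout, task wording, and threshold are illustrative assumptions, not AspectInstruct's actual schema:

```python
# Sketch: turning human rating annotations into four NLG-evaluation task
# forms (scoring, comparison, ranking, Boolean QA). Schema is illustrative.
def scoring_task(item):
    return {"input": item["text"], "label": item["rating"]}

def comparison_task(a, b):
    return {"input": (a["text"], b["text"]),
            "label": "A" if a["rating"] > b["rating"] else "B"}

def ranking_task(items):
    order = sorted(items, key=lambda x: x["rating"], reverse=True)
    return {"input": [x["text"] for x in items],
            "label": [x["text"] for x in order]}

def boolean_task(item, threshold=3):
    # Boolean QA: "is this text acceptable on the aspect?" (assumed cutoff)
    return {"input": item["text"], "label": item["rating"] >= threshold}

ratings = [{"text": "resp1", "rating": 4},
           {"text": "resp2", "rating": 2}]
print(comparison_task(*ratings)["label"])  # "A"
print(boolean_task(ratings[1])["label"])   # False
```

One pool of rated outputs thus yields several distinct instruction-tuning tasks, which is how the paper reaches 65 tasks from 27 aspects without collecting new annotations.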